
Creators/Authors contains: "Brisk, Philip"


  1. Free, publicly-accessible full text available July 1, 2024
  2. Template matching has proven to be an effective method for seismic event detection, but it is biased toward identifying events similar to previously known events and is therefore ineffective at discovering events with non-matching waveforms (e.g., those dissimilar to existing catalog events). In principle, this limitation can be overcome by cross-correlating every segment (possible template) of a seismogram with every other segment to identify all similar event pairs, but doing so has previously been considered computationally infeasible for long time series. Here we describe a method, called the "Matrix Profile" (MP), a "correlate everything with everything" calculation that can be computed efficiently and scalably. The MP returns the maximum correlation coefficient of every sub-window of continuous data with every other sub-window, along with the location of the best-correlated sub-window. We show that MP methods obtain valuable results when applied to months and years of continuous seismic data in both local and global case studies: the MP identifies many new events in Parkfield, California, seismicity that are not contained in existing event catalogs, and it efficiently finds clusters of similar earthquakes in global seismic data. Whether used by itself or as a starting point for subsequent template-matching calculations, the MP is likely to provide a useful new tool for seismology research.
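The quantity the MP reports can be made concrete with a naive all-pairs sketch. This O(n²) version only illustrates what is computed, not the paper's efficient, scalable algorithm; the exclusion-zone width is a common convention I am assuming here, not taken from the abstract:

```python
import numpy as np

def matrix_profile_corr(ts, m):
    """Naive 'correlate everything with everything': for every length-m
    sub-window of ts, return its highest Pearson correlation with any other
    (non-overlapping) sub-window, and the index of that best match."""
    n = len(ts) - m + 1
    windows = np.lib.stride_tricks.sliding_window_view(np.asarray(ts, float), m)
    # z-normalize each window so correlation reduces to a dot product / m
    z = (windows - windows.mean(axis=1, keepdims=True)) / windows.std(axis=1, keepdims=True)
    corr = (z @ z.T) / m                 # all-pairs correlation matrix
    excl = m // 2                        # exclusion zone: skip trivial self-matches
    for i in range(n):
        corr[i, max(0, i - excl):i + excl + 1] = -np.inf
    return corr.max(axis=1), corr.argmax(axis=1)
```

Planting the same motif at two locations in noise and checking that each location's best match points at the other is a quick sanity check of the idea.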

  3. With the proliferation of low-cost sensors and the Internet of Things (IoT), the rate at which data is produced far exceeds the compute and storage capabilities of today's infrastructure. Much of this data takes the form of time series, and in response there has been increasing interest over the last decade in creating time series archives, along with developing and deploying novel analysis methods to process the data. The general strategy has been to apply a variety of similarity search mechanisms to various subsets and subsequences of time series data in order to identify repeated patterns and anomalies; however, the computational demands of these approaches render them incompatible with today's power-constrained embedded CPUs. To address this challenge, we present FA-LAMP, an FPGA-accelerated implementation of the Learned Approximate Matrix Profile (LAMP) algorithm, which predicts the correlation between streaming data sampled in real time and a representative time series dataset used for training. FA-LAMP serves as a real-time solution for time series analysis problems such as classification. We present FA-LAMP implementations on both edge- and cloud-based prototypes. On edge devices, FA-LAMP integrates accelerated computation as close as possible to IoT sensors, eliminating the need to transmit and store data in the cloud for later analysis. On cloud-based accelerators, FA-LAMP can execute multiple LAMP models on the same board, allowing simultaneous processing of incoming data from multiple data sources across a network. LAMP employs a Convolutional Neural Network (CNN) for prediction. This work investigates the challenges and limitations of deploying CNNs on FPGAs using the Xilinx Deep Learning Processor Unit (DPU) and the Vitis AI development environment, exposing several technical limitations of the DPU and providing a mechanism to overcome them by attaching custom IP block accelerators to the architecture.
We evaluate FA-LAMP using a low-cost Xilinx Ultra96-V2 FPGA as well as a cloud-based Xilinx Alveo U280 accelerator card, measuring their performance against a prototypical LAMP deployment running on a Raspberry Pi 3, an Edge TPU, a GPU, a desktop CPU, and a server-class CPU. In the edge scenario, the Ultra96-V2 FPGA delivered better performance and lower energy consumption than the Raspberry Pi; in the cloud scenario, the server CPU and GPU outperformed the Alveo U280 accelerator card, while the desktop CPU achieved comparable performance; however, the Alveo card consumed an order of magnitude less energy than the other four platforms. Our implementation is publicly available at https://github.com/aminiok1/lamp-alveo.
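The abstract does not spell out LAMP's CNN architecture, but the shape of the computation can be sketched: a window of streaming samples passes through a 1-D convolution with ReLU, global average pooling, and a dense head to produce one predicted correlation in [0, 1]. Layer sizes and weights below are illustrative stand-ins, not the trained model from the paper:

```python
import numpy as np

def conv1d(x, kernels, bias):
    """Valid-mode 1-D convolution: x has shape (n,), kernels (c, k);
    the result has shape (c, n - k + 1)."""
    cols = np.lib.stride_tricks.sliding_window_view(x, kernels.shape[1])
    return kernels @ cols.T + bias[:, None]

def lamp_predict(window, params):
    """One forward pass of a LAMP-style predictor (illustrative architecture):
    conv + ReLU, global average pool, dense head, sigmoid -> value in (0, 1)."""
    h = np.maximum(conv1d(window, params["w1"], params["b1"]), 0.0)
    h = h.mean(axis=1)                        # global average pool over time
    logit = params["w2"] @ h + params["b2"]   # scalar output
    return 1.0 / (1.0 + np.exp(-logit))
```

Because the whole pass is a handful of multiply-accumulate loops, it maps naturally onto a fixed-function FPGA datapath such as the DPU described above.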
  4. Mattoli, Virgilio (Ed.)
    Pneumatically-actuated soft robots have advantages over traditional rigid robots in many applications. In particular, their flexible bodies and gentle air-powered movements make them more suitable for use around humans and other objects that could be injured or damaged by traditional robots. However, existing systems for controlling soft robots currently require dedicated electromechanical hardware (usually solenoid valves) to maintain the actuation state (expanded or contracted) of each independent actuator. When combined with power, computation, and sensing components, this control hardware adds considerable cost, size, and power demands to the robot, limiting the feasibility of soft robots in many important application areas. In this work, we introduce a pneumatic memory that uses air (not electricity) to set and maintain the states of large numbers of soft robotic actuators without dedicated electromechanical hardware. Our pneumatic logic circuits use normally-closed microfluidic valves as transistor-like elements, which enables them to support more complex computational functions than circuits built from normally-open valves. We demonstrate an eight-bit nonvolatile random-access pneumatic memory (RAM) that can maintain the states of multiple actuators, control both individual actuators and multiple actuators simultaneously using a pneumatic version of time-division multiplexing (TDM), and set actuators to any intermediate position using a pneumatic version of digital-to-analog conversion. We perform proof-of-concept experimental testing of our pneumatic RAM by using it to control soft robotic hands playing individual notes, chords, and songs on a piano keyboard. By dramatically reducing the hardware required to control multiple independent actuators in pneumatic soft robots, our pneumatic RAM can accelerate the spread of soft robotic technologies to a wide range of important application areas.
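The TDM control idea — one shared supply line addressing many latched valves, one per time slot — can be sketched as a small simulation. The class and method names are invented for illustration; in the real device the latches hold their states pneumatically, not in software:

```python
# Illustrative simulation of time-division multiplexing (TDM) for an
# n-bit pneumatic RAM: each time slot, a demultiplexer routes the shared
# supply line to exactly one latching valve, which then holds its state
# (the nonvolatility) until it is addressed again.

class PneumaticRAM:
    def __init__(self, n_bits=8):
        self.bits = [False] * n_bits          # latched valve states

    def tdm_write(self, schedule):
        """schedule: list of (slot, pressurize) pairs, one per time slot.
        Each slot addresses a single bit via the demultiplexer."""
        for slot, pressurize in schedule:
            self.bits[slot] = pressurize      # addressed latch holds its state

    def read(self):
        return list(self.bits)
```

A single shared line plus a demultiplexer thus replaces one dedicated solenoid valve per actuator, which is the hardware reduction the abstract describes.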
  5. With the proliferation of low-cost sensors and the Internet of Things (IoT), the rate at which data is produced far exceeds the compute and storage capabilities of today's infrastructure. Much of this data takes the form of time series, and in response there has been increasing interest over the last decade in creating time series archives, along with developing and deploying novel analysis methods to process the data. The general strategy has been to apply a variety of similarity search mechanisms to various subsets and subsequences of time series data in order to identify repeated patterns and anomalies; however, the computational demands of these approaches render them incompatible with today's power-constrained embedded CPUs. To address this challenge, we present FA-LAMP, an FPGA-accelerated implementation of the Learned Approximate Matrix Profile (LAMP) algorithm, which predicts the correlation between streaming data sampled in real time and a representative time series dataset used for training. FA-LAMP serves as a real-time solution for time series analysis problems such as classification and anomaly detection, among others. FA-LAMP provides a mechanism to integrate accelerated computation as close as possible to IoT sensors, eliminating the need to transmit and store data in the cloud for later analysis. At their core, LAMP and FA-LAMP employ Convolutional Neural Networks (CNNs) to perform prediction. This work investigates the challenges and limitations of deploying CNNs on FPGAs using state-of-the-art commercially-supported frameworks built for this purpose, namely the Xilinx Deep Learning Processor Unit (DPU) overlay and the Vitis AI development environment. It exposes several technical limitations of the DPU and provides a mechanism to overcome them by attaching our own hand-optimized IP block accelerators to the DPU overlay.
We evaluate FA-LAMP using a low-cost Xilinx Ultra96-V2 FPGA, demonstrating performance and energy improvements of more than an order of magnitude compared to a prototypical LAMP deployment running on a Raspberry Pi 3. Our implementation is publicly available at https://github.com/fccm2021sub/fccm-lamp.
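Predicting correlations over streaming data requires window statistics that update as each sample arrives. A minimal sketch of that bookkeeping — constant time per sample for a fixed window length — is shown below; this is generic stream processing, not FA-LAMP's FPGA datapath:

```python
from collections import deque

class StreamingWindowStats:
    """Running mean and variance of the most recent w samples, updated in
    O(1) per sample -- the prerequisite for z-normalizing each window
    before computing or predicting its correlation."""
    def __init__(self, w):
        self.w = w
        self.buf = deque(maxlen=w)
        self.s1 = 0.0    # running sum of the window
        self.s2 = 0.0    # running sum of squares

    def push(self, x):
        if len(self.buf) == self.w:          # evict the oldest sample
            old = self.buf[0]
            self.s1 -= old
            self.s2 -= old * old
        self.buf.append(x)
        self.s1 += x
        self.s2 += x * x

    def mean_var(self):
        n = len(self.buf)
        mean = self.s1 / n
        return mean, max(self.s2 / n - mean * mean, 0.0)
```

Keeping only two running sums per window is what makes this kind of analysis feasible on power-constrained edge hardware.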
  6. This paper introduces BioScript, a domain-specific language (DSL) for programmable biochemistry that executes on emerging microfluidic platforms. The goal of this research is to provide a simple, intuitive, and type-safe DSL that is accessible to life-science practitioners. The novel feature of the language is its syntax, which aims to optimize human readability; the technical contribution of the paper is the BioScript type system, which ensures that certain classes of biochemistry-specific errors, such as unsafe interactions between chemicals, cannot occur. Results are obtained using a custom-built compiler that implements the BioScript language and type system.
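The kind of guarantee such a type system provides can be illustrated with a toy static check: reject any program that mixes chemicals whose interaction is flagged unsafe, before any fluid moves on-chip. The reagent names and the interaction table below are invented for illustration; they are not from the BioScript compiler:

```python
# Hypothetical interaction table -- a real system would derive this from
# chemical classification data, not a hand-written set.
UNSAFE_PAIRS = {frozenset({"bleach", "ammonia"})}

def check_mix(a, b):
    """Toy BioScript-style check: raise a type error for unsafe mixtures
    at compile time, rather than discovering them on the device."""
    if frozenset({a, b}) in UNSAFE_PAIRS:
        raise TypeError(f"unsafe interaction: {a} + {b}")
    return f"mixture({a}, {b})"
```

Using `frozenset` makes the check order-independent, so `mix(a, b)` and `mix(b, a)` are rejected alike.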
  7. This paper describes an FPGA-based vector engine that accelerates the bootstrapping procedure of Fast Fully Homomorphic Encryption over the Torus (TFHE), a popular and high-performance fully homomorphic encryption scheme. Most of TFHE bootstrapping comprises matrix-vector operations implemented using Torus polynomials, which are not efficiently supported by today's standard arithmetic hardware. Our implementation achieves linear performance scaling with up to 16 vector lanes. Future work will adopt an FFT-based polynomial multiplication scheme and target larger FPGA parts to accommodate more vector lanes.
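The core primitive behind those matrix-vector operations is multiplication of polynomials modulo X^N + 1 (negacyclic convolution), with coefficients taken modulo a power of two as discretized torus elements. A schoolbook sketch follows; the 32-bit modulus is a common TFHE convention assumed here, and this quadratic loop is exactly what an FFT-based scheme would replace:

```python
Q = 1 << 32   # coefficients as 32-bit integers approximating the torus

def negacyclic_mul(a, b):
    """Schoolbook product of a and b in Z_Q[X]/(X^N + 1): any term that
    wraps past degree N - 1 re-enters with its sign flipped (X^N = -1)."""
    n = len(a)
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            k = i + j
            if k < n:
                out[k] = (out[k] + ai * bj) % Q
            else:
                out[k - n] = (out[k - n] - ai * bj) % Q   # negacyclic wrap
    return out
```

Each lane of a vector engine can run one such coefficient-wise multiply-accumulate stream independently, which is why performance scales linearly with the number of lanes.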